Kanpur
BayPrAnoMeta: Bayesian Proto-MAML for Few-Shot Industrial Image Anomaly Detection
Sarkar, Soham, Sen, Tanmay, Banerjee, Sayantan
Industrial image anomaly detection is a challenging problem owing to extreme class imbalance and the scarcity of labeled defective samples, particularly in few-shot settings. We propose BayPrAnoMeta, a Bayesian generalization of Proto-MAML for few-shot industrial image anomaly detection. Unlike existing Proto-MAML approaches that rely on deterministic class prototypes and distance-based adaptation, BayPrAnoMeta replaces prototypes with task-specific probabilistic normality models and performs inner-loop adaptation via a Bayesian posterior predictive likelihood. We model normal support embeddings with a Normal-Inverse-Wishart (NIW) prior, producing a Student-$t$ predictive distribution that enables uncertainty-aware, heavy-tailed anomaly scoring and is essential for robustness in extreme few-shot settings. We further extend BayPrAnoMeta to a federated meta-learning framework with supervised contrastive regularization for heterogeneous industrial clients and prove convergence to stationary points of the resulting nonconvex objective. Experiments on the MVTec AD benchmark demonstrate consistent and significant AUROC improvements over MAML, Proto-MAML, and PatchCore-based methods in few-shot anomaly detection settings.
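The NIW-to-Student-$t$ step described in this abstract can be illustrated with a minimal sketch. It is restricted to one-dimensional embeddings (d = 1) so that it needs only the standard library; the function names and the prior hyperparameters `mu0`, `kappa0`, `nu0`, `psi0` are illustrative assumptions, not the authors' implementation or settings.

```python
import math

def niw_posterior_1d(xs, mu0=0.0, kappa0=1.0, nu0=1.0, psi0=1.0):
    """Conjugate NIW update for scalar data (d = 1). Prior values are assumptions."""
    n = len(xs)
    xbar = sum(xs) / n
    scatter = sum((x - xbar) ** 2 for x in xs)
    kappa_n = kappa0 + n
    nu_n = nu0 + n
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n
    psi_n = psi0 + scatter + (kappa0 * n / kappa_n) * (xbar - mu0) ** 2
    return mu_n, kappa_n, nu_n, psi_n

def student_t_logpdf(x, df, loc, scale2):
    """Log-density of a location-scale Student-t with squared scale `scale2`."""
    z2 = (x - loc) ** 2 / scale2
    return (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
            - 0.5 * math.log(df * math.pi * scale2)
            - (df + 1) / 2 * math.log1p(z2 / df))

def anomaly_score(x, support, **prior):
    """Negative log posterior predictive of x given normal support samples."""
    mu_n, kappa_n, nu_n, psi_n = niw_posterior_1d(support, **prior)
    df = nu_n  # general form is nu_n - d + 1; here d = 1
    scale2 = psi_n * (kappa_n + 1) / (kappa_n * df)
    return -student_t_logpdf(x, df, mu_n, scale2)

# Hypothetical support embeddings from normal samples.
support = [0.9, 1.0, 1.1, 1.05]
near, far = anomaly_score(1.0, support), anomaly_score(5.0, support)
```

A query far from the support set receives a higher score than one close to it, and the heavy tails of the Student-$t$ keep the score finite even when only a handful of support samples are available.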
- Asia > India > West Bengal > Kolkata (0.04)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- Asia > India > Uttar Pradesh > Kanpur (0.04)
A Novel Approach to Tomato Harvesting Using a Hybrid Gripper with Semantic Segmentation and Keypoint Detection
Ansari, Shahid, Gohil, Mahendra Kumar, Maeda, Yusuke, Bhattacharya, Bishakh
Precision agriculture and smart farming are increasingly adopted to improve productivity, reduce input waste, and maintain high product quality under growing demand. These approaches integrate sensing, automation, and data-driven decision-making to improve crop yield and post-harvest quality (Gupta, Abdelsalam, Khorsandroo, and Mittal, 2020). In this context, autonomous robotic harvesting is a key enabling technology for horticulture, where labor shortages and high labor costs directly affect production and consistency. Despite progress in mechanization, many conventional harvesting methods (e.g., combine harvesters, reapers, and trunk shakers) are unsuitable for soft and delicate crops such as tomatoes and strawberries because large contact forces and impacts can bruise or damage the fruit (Cho, Iida, Suguri, Masuda, and Kurita, 2014; Shojaei, 2021). Selective harvesting, where fruits are picked individually at the appropriate ripeness stage, is therefore preferred for high-value crops. However, selective harvesting remains challenging because a robot must (i) detect the target fruit under occlusion, (ii) estimate its pose and identify the pedicel cutting location, and (iii) execute grasping and detachment without damaging the fruit or plant. In real cultivation environments, tomatoes are often densely packed and partially occluded by leaves and branches, making perception and reliable manipulation difficult (Chen et al., 2015). Consequently, integrated harvesting systems that combine compliant end-effectors, robust perception, and closed-loop control remain an active research topic (Comba, Gay, Piccarolo, and Ricauda Aimonino, 2010; Ling, Zhao, Gong, Liu, and Wang, 2019). A wide range of end-effectors has been explored for harvesting and handling soft produce.
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.04)
- Asia > India > Uttar Pradesh > Kanpur (0.04)
- North America > United States (0.04)
Neural Audio Codecs for Prompt-Driven Universal Sound Separation
Banerjee, Adhiraj, Arora, Vipul
Text-guided sound separation supports flexible audio editing across media and assistive applications, but existing models like AudioSep are too compute-heavy for edge deployment. Neural audio codec (NAC) models such as CodecFormer and SDCodec are compute-efficient but limited to fixed-class separation. We introduce CodecSep, the first NAC-based model for on-device universal, text-driven separation. CodecSep combines DAC compression with a Transformer masker modulated by CLAP-derived FiLM parameters. Across six open-domain benchmarks under matched training/prompt protocols, CodecSep surpasses AudioSep in separation fidelity (SI-SDR) while remaining competitive in perceptual quality (ViSQOL) and matching or exceeding fixed-stem baselines (TDANet, CodecFormer, SDCodec). In code-stream deployments, it needs just 1.35 GMACs end-to-end, approximately $54\times$ less compute ($25\times$ architecture-only) than spectrogram-domain separators like AudioSep, while remaining fully bitstream-compatible.
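As a rough illustration of the FiLM modulation mentioned in this abstract, the sketch below projects a text embedding to per-channel scale and shift values and applies them to a sequence of feature frames. The projection weights, shapes, and function names are hypothetical; the abstract does not specify how CodecSep derives or applies these parameters inside its Transformer masker.

```python
def film_params(text_embedding, weight, bias):
    """Linear projection of a text embedding to (gamma, beta).

    `weight` has 2*C rows (hypothetical learned parameters); the first C
    outputs are used as gamma and the last C as beta.
    """
    out = [sum(w * e for w, e in zip(row, text_embedding)) + b
           for row, b in zip(weight, bias)]
    half = len(out) // 2
    return out[:half], out[half:]

def film(features, gamma, beta):
    """Apply feature-wise linear modulation: y[t][c] = gamma[c] * x[t][c] + beta[c]."""
    return [[g * x + b for x, g, b in zip(frame, gamma, beta)]
            for frame in features]

# Toy example: 2 frames, 2 channels, 2-dimensional text embedding.
gamma, beta = film_params([1.0, 2.0],
                          weight=[[1, 0], [0, 1], [1, 1], [0, 0]],
                          bias=[0, 0, 0, 1])
modulated = film([[1.0, 2.0], [3.0, 4.0]], gamma, beta)
```

The same scale and shift are shared across all time frames, so the prompt steers which channels are emphasized without changing the sequence length.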
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Asia > India > Uttar Pradesh > Kanpur (0.04)
- Media (0.47)
- Leisure & Entertainment (0.47)
- Automobiles & Trucks (0.34)
Stacked Ensemble of Fine-Tuned CNNs for Knee Osteoarthritis Severity Grading
Gupta, Adarsh, Kaur, Japleen, Doshi, Tanvi, Sharma, Teena, Verma, Nishchal K., Vasikarla, Shantaram
Knee Osteoarthritis (KOA) is a musculoskeletal condition that can cause significant limitations and impairments in daily activities, especially among older individuals. To evaluate the severity of KOA, X-ray images of the affected knee are typically analyzed, and a grade is assigned based on the Kellgren-Lawrence (KL) grading system, which classifies KOA severity into five levels, ranging from 0 to 4. This approach requires a high level of expertise and time and is susceptible to subjective interpretation, thereby introducing potential diagnostic inaccuracies. To address this problem, a stacked ensemble model of fine-tuned Convolutional Neural Networks (CNNs) was developed for two classification tasks: a binary classifier for detecting the presence of KOA, and a multiclass classifier for precise grading across the KL spectrum. The proposed stacked ensemble model consists of a diverse set of pre-trained architectures, including MobileNetV2, You Only Look Once (YOLOv8), and DenseNet201 as base learners and Categorical Boosting (CatBoost) as the meta-learner. The proposed model achieved a balanced test accuracy of 73% in multiclass classification and 87.5% in binary classification, which is higher than previous works in the literature. Knee Osteoarthritis (KOA) [1] is a degenerative musculoskeletal joint disease in which the knee cartilage breaks down over time.
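The stacking arrangement described in this abstract can be sketched at the data level: each base learner emits class probabilities per image, and these are concatenated into one feature row per image for the meta-learner. The helper name and the toy probabilities are assumptions for illustration; fitting the CatBoost meta-learner named in the abstract is left out, and the abstract does not say how the base-model predictions were generated for training it.

```python
def stack_meta_features(prob_outputs):
    """Concatenate per-sample class probabilities from several base models.

    prob_outputs: list over base models; each entry is a list over samples,
    each sample being a list of class probabilities.
    Returns one flat feature row per sample.
    """
    n_samples = len(prob_outputs[0])
    if any(len(model) != n_samples for model in prob_outputs):
        raise ValueError("all base models must score the same samples")
    return [[p for model in prob_outputs for p in model[i]]
            for i in range(n_samples)]

# Toy binary example with two hypothetical base learners and two images.
model_a = [[0.9, 0.1], [0.2, 0.8]]
model_b = [[0.7, 0.3], [0.4, 0.6]]
meta_X = stack_meta_features([model_a, model_b])
```

In a typical stacking setup, the rows of `meta_X` would come from held-out predictions so that the meta-learner does not simply memorize base-model overfitting.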
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- North America > United States > California > San Diego County > La Jolla (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Asia > India > Uttar Pradesh > Kanpur (0.04)
- North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
- North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
- North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > India > Uttar Pradesh > Kanpur (0.04)
TathyaNyaya and FactLegalLlama: Advancing Factual Judgment Prediction and Explanation in the Indian Legal Context
Nigam, Shubham Kumar, Patnaik, Balaramamahanthi Deepak, Mishra, Shivam, Shallum, Noel, Ghosh, Kripabandhu, Bhattacharya, Arnab
In the landscape of Fact-based Judgment Prediction and Explanation (FJPE), reliance on factual data is essential for developing robust and realistic AI-driven decision-making tools. This paper introduces TathyaNyaya, the largest annotated dataset for FJPE tailored to the Indian legal context, encompassing judgments from the Supreme Court of India and various High Courts. Derived from the Hindi terms "Tathya" (fact) and "Nyaya" (justice), the TathyaNyaya dataset is uniquely designed to focus on factual statements rather than complete legal texts, reflecting real-world judicial processes where factual data drives outcomes. Complementing this dataset, we present FactLegalLlama, an instruction-tuned variant of the LLaMa-3-8B Large Language Model (LLM), optimized for generating high-quality explanations in FJPE tasks. Finetuned on the factual data in TathyaNyaya, FactLegalLlama integrates predictive accuracy with coherent, contextually relevant explanations, addressing the critical need for transparency and interpretability in AI-assisted legal systems. Our methodology combines transformers for binary judgment prediction with FactLegalLlama for explanation generation, creating a robust framework for advancing FJPE in the Indian legal domain. TathyaNyaya not only surpasses existing datasets in scale and diversity but also establishes a benchmark for building explainable AI systems in legal analysis. The findings underscore the importance of factual precision and domain-specific tuning in enhancing predictive performance and interpretability, positioning TathyaNyaya and FactLegalLlama as foundational resources for AI-assisted legal decision-making.
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Asia > Middle East > UAE > Dubai Emirate > Dubai (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)
NyayaRAG: Realistic Legal Judgment Prediction with RAG under the Indian Common Law System
Nigam, Shubham Kumar, Patnaik, Balaramamahanthi Deepak, Mishra, Shivam, Thomas, Ajay Varghese, Shallum, Noel, Ghosh, Kripabandhu, Bhattacharya, Arnab
Legal Judgment Prediction (LJP) has emerged as a key area in AI for law, aiming to automate judicial outcome forecasting and enhance interpretability in legal reasoning. While previous approaches in the Indian context have relied on internal case content such as facts, issues, and reasoning, they often overlook a core element of common law systems, which is reliance on statutory provisions and judicial precedents. In this work, we propose NyayaRAG, a Retrieval-Augmented Generation (RAG) framework that simulates realistic courtroom scenarios by providing models with factual case descriptions, relevant legal statutes, and semantically retrieved prior cases. NyayaRAG evaluates the effectiveness of these combined inputs in predicting court decisions and generating legal explanations using a domain-specific pipeline tailored to the Indian legal system. We assess performance across various input configurations using both standard lexical and semantic metrics as well as LLM-based evaluators such as G-Eval. Our results show that augmenting factual inputs with structured legal knowledge significantly improves both predictive accuracy and explanation quality.
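The retrieval-and-prompting idea in this abstract can be sketched as follows: prior cases are ranked by cosine similarity to the query embedding, and the top matches are placed in a prompt alongside the facts and statutes. The embeddings, case identifiers, and prompt layout are hypothetical placeholders, not NyayaRAG's actual pipeline.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_top_k(query_vec, case_vecs, k=2):
    """Return the ids of the k prior cases most similar to the query."""
    ranked = sorted(case_vecs.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [case_id for case_id, _ in ranked[:k]]

def build_prompt(facts, statutes, precedents):
    """Assemble the three input types into one prompt (layout is an assumption)."""
    return ("Facts:\n" + facts + "\n\n"
            "Statutes:\n" + "\n".join(statutes) + "\n\n"
            "Prior cases:\n" + "\n".join(precedents) + "\n\n"
            "Predict the decision and explain the reasoning.")

# Toy embeddings for three hypothetical prior cases.
case_vecs = {"case_A": [1.0, 0.0], "case_B": [0.0, 1.0], "case_C": [1.0, 1.0]}
top = retrieve_top_k([1.0, 0.1], case_vecs, k=2)
prompt = build_prompt("Summary of the facts.", ["Statute text."], top)
```

Dropping the statutes or precedents arguments to empty lists gives the reduced input configurations that such a study would compare against.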
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Asia > Middle East > UAE > Dubai Emirate > Dubai (0.04)
AGGRNet: Selective Feature Extraction and Aggregation for Enhanced Medical Image Classification
Makwe, Ansh, Agrawal, Akansh, Jain, Prateek, Agrawal, Akshan, Bagade, Priyanka
Medical image analysis for complex tasks such as severity grading and disease subtype classification poses significant challenges due to intricate and similar visual patterns among classes, scarcity of labeled data, and variability in expert interpretations. Despite the usefulness of existing attention-based models in capturing complex visual patterns for medical image classification, the underlying architectures often have difficulty distinguishing subtle classes because they struggle to capture inter-class similarity and intra-class variability, resulting in incorrect diagnoses. To address this, we propose the AGGRNet framework, which extracts informative and non-informative features to effectively understand fine-grained visual patterns and improve classification for complex medical image analysis tasks. Experimental results show that our model achieves state-of-the-art performance on various medical imaging datasets, with a best improvement of up to 5% over SOTA models on the Kvasir dataset.
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- Asia > India > Uttar Pradesh > Kanpur (0.04)
- Research Report (1.00)
- Instructional Material > Online (0.61)
- Instructional Material > Course Syllabus & Notes (0.61)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision > Image Understanding (0.72)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)